Created: 2026-03-06 07:53:04
Updated: 2026-03-06 07:53:04

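Suppose the true probability distribution is $P$ and our guessed distribution is $Q$. After drawing $N$ samples, outcome $i$ occurs $p_{i}N$ times, so the probability under $Q$ of the data we actually observe is

$$\mathcal{P}=\prod_{i}q_{i}^{p_{i}N}\,\frac{N!}{\prod_{j}(p_{j}N)!}\sim 2^{-N\sum_{i}p_{i}\left(\log p_{i}-\log q_{i}\right)},$$

where the products run over the possible outcomes and logarithms are base 2.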
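
The asymptotic form can be checked with Stirling's approximation. As a sketch, keeping only the leading terms $\log n!\approx n\log n-n\log e$ and assuming $\sum_{i}p_{i}=1$:

$$
\begin{aligned}
\log\mathcal{P} &= N\sum_{i}p_{i}\log q_{i}+\log N!-\sum_{j}\log (p_{j}N)! \\
&\approx N\sum_{i}p_{i}\log q_{i}+N\log N-\sum_{j}p_{j}N\log (p_{j}N) \\
&= -N\sum_{i}p_{i}\left(\log p_{i}-\log q_{i}\right).
\end{aligned}
$$

The $N\log e$ terms cancel between $N!$ and the $(p_{j}N)!$ factors, and the corrections dropped here grow only like $\log N$.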

The relative entropy is

$$S(P\|Q)=\sum_{i}p_{i}\left(\log p_{i}-\log q_{i}\right)$$

If $P=Q$, the initial hypothesis is correct. If it is wrong, we can tell as soon as $N\,S(P\|Q)\gg 1$.
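
A minimal numerical sketch of this claim, assuming a three-outcome example with made-up distributions `p` and `q` (the function names are illustrative, not from the original note). It compares the exact multinomial probability of the typical counts $p_{i}N$ under $Q$ with the exponent $N\,S(P\|Q)$:

```python
import math

def relative_entropy(p, q):
    """S(P||Q) in bits; terms with p_i = 0 contribute zero."""
    return sum(pi * (math.log2(pi) - math.log2(qi))
               for pi, qi in zip(p, q) if pi > 0)

def log2_prob_of_counts(counts, q):
    """Exact log2 of the multinomial probability of `counts` under q."""
    n = sum(counts)
    # log of N! / prod_j (count_j)!, via the log-gamma function
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    # log of prod_i q_i^{count_i}
    log_term = sum(c * math.log(qi) for c, qi in zip(counts, q) if c > 0)
    return (log_coeff + log_term) / math.log(2)

# Hypothetical example distributions (assumptions for illustration only).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
s = relative_entropy(p, q)

for n in (10, 100, 1000, 10000):
    counts = [round(pi * n) for pi in p]
    exact_rate = -log2_prob_of_counts(counts, q) / n
    print(f"N={n:6d}  -log2(P)/N={exact_rate:.5f}  S(P||Q)={s:.5f}")
```

As $N$ grows, $-\log_{2}\mathcal{P}/N$ approaches $S(P\|Q)$, so the observed data become exponentially unlikely under the wrong guess once $N\,S(P\|Q)$ is large.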

Consider a pair of random variables $x,y$ with joint distribution $p(x_{i},y_{j})$ and marginals $p(x_{i})$, $p(y_{j})$. Define $q(x_{i},y_{j})=p(x_{i})p(y_{j})$. Then

$$S(P\|Q)=S_{X}+S_{Y}-S_{XY}=I(X,Y).$$
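
The identity can be spelled out as follows, assuming the usual definition $S_{X}=-\sum_{i}p(x_{i})\log p(x_{i})$ and likewise for $S_{Y}$ and $S_{XY}$:

$$
\begin{aligned}
S(P\|Q) &= \sum_{i,j}p(x_{i},y_{j})\left[\log p(x_{i},y_{j})-\log p(x_{i})-\log p(y_{j})\right] \\
&= -S_{XY}-\sum_{i}p(x_{i})\log p(x_{i})-\sum_{j}p(y_{j})\log p(y_{j}) \\
&= S_{X}+S_{Y}-S_{XY},
\end{aligned}
$$

where the second line uses $\sum_{j}p(x_{i},y_{j})=p(x_{i})$ and $\sum_{i}p(x_{i},y_{j})=p(y_{j})$.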

$$I(X,Y)\geq 0$$

Equality holds if and only if $x$ and $y$ are independent.
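
A small sketch of the mutual information computed as $S_{X}+S_{Y}-S_{XY}$ from a joint probability table. The two example tables and the function names are assumptions chosen for illustration:

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute zero."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X,Y) = S_X + S_Y - S_XY for a joint table joint[i][j] = p(x_i, y_j)."""
    px = [sum(row) for row in joint]          # marginal of x
    py = [sum(col) for col in zip(*joint)]    # marginal of y
    pxy = [p for row in joint for p in row]   # flattened joint distribution
    return entropy(px) + entropy(py) - entropy(pxy)

# Independent case: the joint is the product of uniform marginals.
independent = [[0.25, 0.25],
               [0.25, 0.25]]
# Fully dependent case: y is determined by x.
dependent = [[0.5, 0.0],
             [0.0, 0.5]]

print(mutual_information(independent))  # 0.0
print(mutual_information(dependent))    # 1.0
```

The independent table gives zero mutual information, while the fully dependent one gives one bit, the entropy of either variable.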
